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Abstract 

We describe a mechanical device which can be used as an analog com- 
puter to solve the transportation problem. In practice this device is simu- 
lated by a numerical algorithm. Tests show that this algorithm is 60 times 
faster than a current subroutine (NAG library) for an average 1000 x 1000 
problem. Its performance is even better for degenerate problems in which 
the weights take only a small number of integer values. 
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1 Introduction 



We describe here an algorithm for the solution of the transportation problem 
M (also known as the Hitchcock problem). 

The development of this algorithm had its origin in studies of the lattice 
gas method for three-dimensional fluid simulations (§) . The optimization of the 
collision table has generally the form of a transportation problem || [l2| , with 
large cost matrices. Classical algorithms were found to require prohibitively 
long computing times. Therefore an attempt was made to devise a method 
which would take advantage of the peculiarities of the lattice gas problem. This 
method then turned out to be of general applicability. 

The present algorithm was developed independently of the already published 
studies of the transportation problem and related optimization problems. This 
was not planned; it only reflects the way things happened, and the ignorance 
of this author who comes from a rather different field. More will be said about 
this in Section ^|. 

The paper is organized as follows. Section || defines the problem. In Sec- 
tion we describe a mechanical device which can be used as an analog computer 
to solve the transportation problem. In Section || we develop an appropriate 
graph representation and a numerical scheme which simulates the mechanical 
model. This is illustrated by a detailed example in Section |[ In Section ^, we 
give a rigorous definition and justification of the algorithm. Section |?j describes 
some aspects of the computer implementation. In Section |8|, the algorithm is 
compared with the NAG library subroutine for the solution of the transporta- 
tion problem. In the particular case of the assignment problem, comparisons are 
also made with the algorithm of Burkard and Derigs [Q . A few comments are 
made in Section ^. Finally, an Appendix derives some bounds on the number 
of operations. 



We are given a matrix and two vectors a,i > 0, bj > 0. The index i runs from 
1 to m and the index j runs from 1 to n. There is 



2 The problem 
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(1) 



The problem is to find coefficients fij which maximize the sum 




(2) 



j 



subject to the constraints 



22h = a t (i = l,...,m), (3) 

3 

^2 fa = h (j = l,...,n), (4) 

i 

fa > 0. (5) 

A set of fij which satisfies the constraints (|J) to (|J) will be called a feasible 
solution. A feasible solution which maximizes (Q) will be called an optimal 
solution. 

Notes: (i) This is essentially the transportation problem. It can be reduced 
to the standard form given for instance in jlll Section 7.4], by defining 

C SU p Cij , (6) 

where c sup is a constant satisfying 

c sup > maxcy. (7) 

(ii) There is no sign condition on the Cjj, which can be positive or negative. 

(iii) If cij = for one row i, then from fl) and (||) we have fij = for all j. 
This row does not contribute to the sum (||); the values of the Cjj on that row 
are irrelevant. Thus this row could be eliminated without changing the problem. 
The same holds if bj = for some j. We could therefore in principle restrict our 
attention to the case where the ai and bj are strictly positive, as is usually done 
[q 0- I n practice, however, it is convenient to be able to include the cases 
with some = and/or bj — into the general treatment. The algorithm to 
be described works just as well in such cases. 

(iv) The ai, bj, can be integer or real numbers. 

(v) A special case of the transportation problem is 

Oi = 1 (i = l,...,m), bj = l (j = l,...,n). (8) 
From (Q) it follows then that 

m = n. (9) 
If we prescribe the additional constraint 

fa = or 1, (10) 

we obtain the classical assignment problem (Jll], chap. 11]). 
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3 A mechanical model 



We show now that it is possible to build a simple mechanical device which solves 
the transportation problem. This device acts as a analog computer, the numbers 
entering the problem are represented by physical quantities, and the equations 
are replaced by physical laws. 

We define a system of axes x, y, z in physical space (Fig. [I]). 




Figure 1: An analog computer for the solution of the transportation problem. 

To every value of i is associated a rod Ai parallel to the y axis, which we will 
call a row by reference to the cy matrix. Mechanical constraints (not shown on 
the figure) ensure that each row can only move in the vertical direction. More 
precisely, each row remains parallel to the y axis and moves in a fixed vertical 
plane x = Const. The variable height z of the lower face of row Ai will be 
designated by a*. Row Ai has a weight a, and thus is subjected to a force 
towards the negative z axis. 
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Similarly, to every value of j is associated a rod Bj parallel to the x axis, 
which we will call a column. (Notice that these "columns" are horizontal in the 
three-dimensional physical space !). Each column is constrained to move only 
vertically: it remains parallel to the x axis and moves in a fixed vertical plane 
y = Const. It has a negative weight — bj (or, if one prefers, a buoyancy bj) and 
thus is subjected to a force bj towards the positive z axis. (In an actual model, 
this might be realized with cables and counterweights). We call z = f3j the 
height of its upper face. 

Finally, small vertical cylinders or studs of height Cy and of negligible weight 
are placed on the columns, in such a way that each stud enforces a minimal 
distance between row Ai and column Bj-. 



(Note: this description seems to imply that c,j > 0. Actually it is possible, 
although mechanically more awkward, to have negative values of the cy by 
bending the rods. One can also make all c,j positive by adding a sufficiently 
large constant to all of them. Therefore we continue to consider that the can 
be arbitrary.) 

The potential energy of the system is, within an additive constant: 



Initially, all rods are maintained at a fixed position by two additional fixed 
rods P and Q acting as stops (Fig. [j]), with the rows Ai well above the columns 
Bj, so that there is no contact between the rows and the studs. For instance 
we take j3j = (j = 1, . . . , n) and ai — c SU p (i — 1, . . . , to). Then the rods are 
released by removing the stops P and Q, and the system starts evolving. Rows 
go down, columns go up, and contacts are made with the studs. Aggregates of 
rows and columns are progressively formed. As new contacts are made, these 
aggregates are modified. Thus a complex evolution may take place. 

It will be convenient to imagine that the system is immersed in a viscous 
fluid, so that the velocity of an object, rather than its acceleration, is propor- 
tional to the force to which it is subjected. More specifically, let us define 
formally an aggregate as a subset S of rows and columns which are in contact 
(they form a connected set in space) and move with the same velocity. We also 
assume that it is a maximal connected set: no other row or column touches it. 
The downward force acting on this aggregate is its total weight, which we call 



&i - Pj > Cij. 



(11) 




(12) 



j 



M(S): 




(13) 



G 



We will simply assume that the aggregate moves with a velocity 

| - -M(S). ,14, 

In particular an aggregate remains motionless if the force applied to it vanishes. 

It is intuitively clear that the system will reach an equilibrium, an in fact 
we have 

Theorem 1 An equilibrium is reached after a finite time. 
Proof: an aggregate S has a potential energy 

u(s)= J2 «*«<- E Wi- ( 15 ) 

From (|l^) and ([lj) we find that this potential energy decreases with time ac- 
cording to 

^ = -M\S). (16) 

We call |M| m i n the minimum of all non-zero values of l-M^S 1 )], over all subsets 
S of the full set of rods. Since there is only a finite number of subsets, we have 
|M| min > 0, and: 

either dU(S)/dt = 0, , , 

or dU(S)/dt < -\M\l in . (i7J 

The total potential energy U is the sum of the potential energies of the aggre- 
gates. Therefore we also have 



either dll/dt = 0, 
or dU/dt < -\M\ 2 mi 



(18) 



The first case is realized only if dU(S)/dt = for every aggregate, i.e. if all 
aggregates are motionless. Thus: either the system is in equilibrium, or its 
potential energy decreases at a rate at least equal to |M|^j n . 
On the other hand we have 



f3j > mincy . (19) 



Multiplying by aibj, summing on i and j, and using (Q), we obtain 

U E a i ^ min c »i (E a ' '- 20 -' 



This gives a lower bound for U. 

Combining these results, wc find that the system reaches an equilibrium 
after a finite time (for which an upper bound is easily derived). ■ 

We consider now such an equilibrium state. We will show that 
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Theorem 2 If the system is in equilibrium, and if Fij is the force transmitted 
through studcij from row Ai to column Bj, then fij — Fij is an optimal solution 
of the transportation problem. 

Proof: (i) Each row is in equilibrium, therefore 

J2 F a = a i- ( 21 ) 

3 

(ii) Each column is in equilibrium, therefore 

>>,- (22) 

i 

(iii) cannot be negative since it is transmitted by contact: 

Fa > 0. (23) 

Therefore F^ is a feasible solution. 

(iv) If F^ > 0, row Ai is in contact with column Bj, and therefore 

ai -Pj=Cij. (24) 

It follows that 

F i j(a i -l3j-Ci j )=Q (25) 
Summing ( p5| ) over i and j and using @ and @, we obtain 

E "<"< - E Wi -EE F ^ = °- r2iU 

i 3 i j 

Consider another feasible solution From (||) and ([ll]) we have 

flj(ai-Pj-Cij)>0 Vi,j (27) 
and therefore, summing over i and j and using (g) and (||) : 

E "<"< - E ^ "EE fiw ■ °- ( 28 ) 

Comparing with (p6|), we have 

EE EE''-'-" (M) 

which shows that Fij is optimal. ■ 

Incidentally, (^6[) shows that the "cost" ^ FijCij of the optimal solution 
is equal to the potential energy U of the corresponding equilibrium. 
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4 Numerical simulation 



4.1 Method 

The analog computer described in the previous Section could be built in prin- 
ciple, but for large values of m and n this would be impractical. Instead, we 
will simulate on a digital computer the behaviour of the analog computer, as it 
progressively settles into an equilibrium. 

It would be possible to simulate the evolution of the mechanical system as 
described in the previous Section, i.e. to remove suddenly the two stops P 
and Q and let the system evolve freely until it has found an equilibrium. In 
that case, however, all rods would interact more or less simultaneously, and the 
simulation would be somewhat complex; essentially we would have to solve an 
N-body problem. It turns out to be simpler and also more efficient to guide 
the system through a more controlled and orderly evolution. This is permitted 
because, as shown by Theorem |[ all we need in order to solve the transportation 
problem is to find an equilibrium; how we arrive at it is irrelevant. 

Many algorithms can be imagined. Here we will only describe one of the 
simplest methods, which was found to work well, although it is not always the 
most efficient in terms of computing time (see below Section Q). We give here 
an informal description of the algorithm, based on physical intuition; a rigorous 
derivation will be presented in Section |(| 

The stops P and Q are not removed. Instead, the stop P is held fixed during 
the whole process, and the stop Q is slowly lowered from its initial position. 
The velocity of descent is smaller than the minimal non-zero velocity of any 
aggregate, |M| m ; n , so that the evolution is fully controlled by the motion of the 
stop Q. At any given time the system is in quasi-equilibrium: if Q stops then 
nothing moves anymore. 

The evolution of the system will be studied in detail below. It ends when 
the whole system of rows and columns comes to rest. The stop Q, continuing 
its descent, ceases then to be in contact with any row and can be removed. The 
force exerted by the stop P on any column still in contact with it is then zero, 
and that stop can also be removed. Thus an equibrilium has been reached, from 
which the optimal solution can be read. 

4.2 Graph representation 

The state of the system at any given time can be conveniently represented by a 
graph, as follows. Each rod is a node of the graph; we represent rows by squares 
and columns by circles. An edge always joins a square to a circle: the graph is 
bipartite. An edge is present between row i and column j when the row and the 
column are in contact through the stud Cy, i.e. when 

ati - 0j = c^. (30) 
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The stops are also represented by nodes, and their contacts with rods are simi- 
larly represented by edges. In order to preserve the bipartite property, the stop 
P must then be a square while the stop Q is a circle. As an illustration, Fig. || 
represents the initial state of the system, before any row has been lowered. Note 
that the graph is a purely topological representation: the position of a symbol 
in the graph has nothing to do with the position of the corresponding rod in 
physical space. 




Figure 2: Graph of the initial state. 

The force transmitted downwards from a row to a column, or from a row 
to the Q stop, or from the P stop to a column, can be written beside the 
corresponding edge. In the initial state, only the last two kinds of forces are 
present; they have the values indicated on Fig. ||. Note that this force is always 
positive or zero. At any given time in the procedure, each row is in equilibrium; 
therefore the sum of the forces emanating from it must equal the absolute value 
of its weight a^. Similarly, each column is in equilibrium, and the sum of the 
forces received by it must equal its buoyancy bj . 

Can this graph have cycles ? A cycle will be an even sequence of alternating 
rows and columns: 

h,ji,i2,h,---,ip,j P - (31) 

From ( fio| ) we have then 

c iljl ~ c ilj\ C ilh ~ c i'ijl + • • • + c i p j p ~ c iij P = 0- (32) 

In order to simplify the exposition, we make the following assumption (which 
will be removed in Section ^) : 

Assumption 1 The Cy are such that there are no linear relations of the form 
( pl[ ) between them. 

Then the graph has no cycles and is a forest, or a collection of trees. Each 
tree corresponds to an aggregate as defined in Section |^. 

It will be convenient to assume also, for the time being, that no subset of 
rows and columns can be in equilibrium. This can be expressed as: 
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Assumption 2 Let I be a subset of {1, ... , m} and J a subset of {1, ... ,n}. 
The relation 

fei jeJ 

is true only in two cases: (i) I = J = 0; (ii) I = {1, . . . , to} and J = {1, . . . ,n}. 

Note that this excludes in particular a, = or 6j = 0: rows and columns 
must have positive weights and buoyancies. 

We remark that these two assumptions are satisfied in principle in the generic 
case, when the cij, bj, dj are real numbers with arbitrary values. In practice, 
however, these numbers are often integers with a restricted range and the as- 
sumptions are frequently violated. 

It will be proved in Section ^| that the graph always consists of two trees, 
except at particular instants of time where they fuse into a single tree (see below 



Section 4.3). One of them contains Q and will be called moving tree. The other 
contains P and will be called fixed tree. We will represent the two trees with 
the usual hierarchical representation of trees (||, Section 2.3), taking the stop 
as root for each tree. Fig. ^ shows an example. 

It will be convenient to extend the usual terminology of parents and children 
by specifying that rows are male and columns are female. The stop Q is female 
and the stop P is male. To recapitulate: 



row 
stop P 

column 
stop Q 



male, 



female. (34) 



In each tree, sex is thus alternating from one generation to the next. A row 
has one mother and any number of daughters, while a column has one father 
and any number of sons. The stops themselves have no parents. 

4.3 Contact and rearrangement 

As the stop Q goes down, the rows and columns which belong to the moving 
tree move with it. This continues until a contact is made between the two 
trees. Since rows always remain above columns in physical space, this contact 
happens necessarily between a moving row Ai a and a fixed column Bj c . A new 
edge is created and the two trees are temporarily fused into a single tree. A 
single contact is made, because two simultaneous contacts would imply a cycle, 
in contradiction to Assumption [l]. Fig. || shows an example, posterior in time to 
Fig. ||, where contact is made between the row A 3 and the column B\. (Note: 
in the example of Fig. ^, the row and the column which make contact happen 
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A 4 



Figure 3: Example of a graph. Left: moving tree. Right: fixed tree. 
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to be at the same level in their respective trees; this need not be so in general.) 




Figure 4: Contact between the moving and the fixed tree, and readjustment of 
the forces. Here the numbers +A and —A represent the changes in the forces, 
rather than the forces themselves. 

As a result of the contact, the forces change. To understand what happens, 
it is convenient to imagine that the studs are slightly elastic, so that the change 
does not happen all at once, but progressively over a small interval of time. The 
force A along the newly created edge, which is initially zero, increases as the 
moving row moves down. This induces other changes in neighbouring edges. 
We consider the path from Q to P made by (i) the path from Q to Ai c in the 
moving tree; (ii) the newly created edge from Ai a to Bj c ; (iii) the path from Bj c 
to P in the fixed tree. We call this the main path. In Fig. |[ for instance, the 
main path is QA5B4A3B1A1B3P. The graph can then be viewed as made up of 
the main path, plus a number of lateral branches. The lateral branches are not 
involved in the readjustment of the forces; each of them is attached to the main 
path by a single edge and the force along that edge equals the weight or the 
buoyancy of the branch, which does not change. Therefore, only forces along 
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the main path can change. Each node must remain in equilibrium; therefore all 
changes have the same modulus A and alternate in sign along the main path, 
as shown by Fig. || (Note in particular that the forces of contact with the two 
stops decrease). 

This continues until one of the decreasing forces becomes zero. (Two forces 
cannot vanish simultancoulsy, because the intermediate tree would have zero 
weight, in contradiction to Assumption^). The chain breaks then at the corre- 
sponding edge, which disappears, and we have again two separate trees. Thus, 
the whole episode ends in a capture of a part of one tree by the other. The 
capture can occur in either direction, depending on where is the weakest link 
of the main path. For instance if the weakest link in Fig. [| is between B\ and 
At, the branch of the fixed tree with head B\ is captured by the moving tree 
and we obtain Fig. ||. B\ ceases to be in contact with A\ as the moving tree 
continues its descent. 

The new moving tree continues to go down with the stop Q. Eventually a 
new contact is made, and one of the trees captures a part of the other. This 
goes on until the moving tree is reduced to the stop Q alone. It can be shown 
that this always happens after a finite number of captures (see Appendix |A]) . 
Only the fixed tree remains, now containing all rows and all columns, and we 
have the sought equilibrium. 
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5 Example 



We exhibit here the step-by-step progress of the algorithm on a simple example 
with m = 3, n = 4. Table ^ shows the values of the given coefficients a,, bj, and 
Cij. Note that the condition (Q) is verified. 



Table 1: Values of the parameters a% (left column), bj (top row), and Cij for the 
example problem. 





44 


52 


13 


37 


86 


26 


64 


33 


62 


4 


63 


27 


13 


14 


56 


94 


4 


4 


52 



When executing the algorithm by hand, it is convenient to keep track of the 
distances between the rows and the studs, i.e. the quantities 

Jij = cn - f3j - c i3 . (35) 

One can then easily determine where the next contact will take place. The 
distances jij are shown on the left in Fig. [| while the graph (moving tree 
and fixed tree) is shown on the right. Only the indices i or j of the rows and 
columns are indicated; the type is indicated by the symbol (square for a row, 
circle for a column). Evolution proceeds from top to bottom; successive steps 
are represented in lines labelled a, b, c, .... Lines b, d, f, . . . , correspond to a 
descent of the moving tree; the distances change, while the forces and the trees 
remain fixed. Conversely, lines c, e, g, . . . , correspond to a readjustment of the 
forces and of the trees, during which the distances do not change. 

Initially we set the height of the rows and columns at at — 100 and (3j = 0. 
The corresponding distances jij are then obtained by complementing to 100 the 
values of table [l] and are shown in Fig. pi line a. The initial moving and fixed 



trees are set up as indicated in Section 4.2, Fig. H, and arc shown in line b 



All rows are moving and all columns are fixed, therefore all distances 7ij 
decrease. From line a, we immediately find that the moving tree can descend a 
distance d = 6; a contact is then made between row 3 and column 1. The new 
distances are shown in line c. 

We now readjust the forces and the trees. The main path is: stop Q — row 3 
— column 1 — stop P. The weakest link is between the column 1 and the stop 
P, with a force 44. Therefore the column 1 is captured by the moving tree. The 
forces along the main path change by A = ±44. The new trees and the new 
forces are represented in line d. 

All rows are still moving; in addition, column 1 is also moving. Therefore 
the distances 7™ remain fixed for j ' = 1 (first column of matrix) and decrease 
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for j 6 {2, 3, 4}. From line c we find then that the distance of descent is d = 30. 
Contact is made between row 1 and column 2. The new distances are shown in 
line e. 

Column 2 is captured by the moving tree and we obtain line f. Now the 
distances decrease for j £ {3,4}. After a descent of d = 2, contact is made 
between row 1 and column 4. The new distances are shown in line g. 

This time the weakest link is between Q and row 1. Therefore row 1, and 
its daughter the column 2, are captured by the fixed tree. The new trees are 
represented in line h. 

Now only the distances 7y with i 6 {2,3} and j £ {2,3,4} are decreasing. 
Therefore we have a descent of d = 10. Note that 711 is increasing, since row 
1 is fixed and column 1 is moving. The other distances remain fixed. The new 
distances are shown in line i. Contact is made between row 3 and column 4. 

The evolution continues. In line i, the column 4 and its two descendants 
are captured by the moving tree. In line k, we observe a more complex event, 
involving the capture of a large piece of the main path and a drastic reorgani- 
zation of the trees. Finally, in line m the last remnant of the moving tree is 
captured. 

We reach line n, where only the fixed tree remains. The forces along the 
edges of that tree give the solution of the transportation problem. They can be 
rewritten in matrix form (Table 0). 
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74 


36 


67 


38 


a 


37 


73 


87 


86 




6 


96 


96 


48 






68 


30 


61 


32 


c 


31 


67 


81 


80 







90 


90 


42 



d = 30 n 






68 





31 


2 


e 


31 


37 


51 


50 







60 


60 


12 



d = 2 






Figure 6: Excmplc. Left: the distances %j = oti — (3j — between the 
and the studs. Right: moving tree and fixed tree. 
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29 





31 


37 


49 


48 





60 


58 


10 



d= 10 




78 





29 





31 


27 


39 


38 





50 


48 






d= 29 




78 











31 


27 


10 


38 





50 
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Figure 6 (continued). 
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Figure 6 (continued). 
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Table 2: Solution of the transportation problem denned by Table [l]. 






44 



52 9 25 
4 
12 



6 Formal definition and justification of the al- 
gorithm 

6.1 Definitions 

In the present Section, we give a more rigorous definition of the numerical 
algorithm, and we prove that it solves the transportation problem. We also 
drop the restrictions introduced in Section |]: assumptions [l] and |^ do not have 
to be satisfied any more, i.e. relations of the form ( |32| ) and (^) are allowed. 

This Section is independent of the description of the mechanical model in 
Section pi and also of the informal description of the numerical algorithm in 
Section ET. It will be sometimes convenient to use names which are reminiscent 
of the origin of the algorithm, such as "descent" or "force"; the reasoning, 
however, will be purely mathematical. 

In what follows, unless otherwise specified, i is always understood to take 
all values from 1 to to and j to take all values from 1 to n. 

We want to solve a transportation problem defined by given to, n, a^, bj, Cij 
(Section ||). The algorithm operates on the following collection of objects: 

• A graph with m + n + 2 nodes labelled A±, . . . , A m , B%, ... , B n , P, Q. 

This set of nodes remains invariant during the course of the computation. 
On the other hand, the set of edges varies. 

• To each Ai node is associated a variable number a,. To each Bj node is 
similarly associated a variable number (3j . 

• To each edge is also associated a variable number, which will be called a 
force. 

6.2 Properties 

In the following Section, we will define the operation of the algorithm; simulta- 
neously, we will prove that the following properties hold throughout the com- 
putation. 



21 



PI Only the following kinds of edges are allowed: between an Ai and a Bj, 
between an Ai and Q, and between P and a Bj. 

It follows that the graph is bipartite, the two subsets of nodes being 
{A 1 ,...,A m ,P} and {Si,..., B n ,Q}. 

P2 The graph is a forest, consisting of one or two trees. 

P3 When there are two trees, one of them contains P and at least one other 
node. The other tree contains Q and at least one other node. They will 
be called respectively Tf or fixed tree and T rn or moving tree. 

P4 jij > 0. (-fij is the distance defined by (|35|)). 

P5 If there is an edge between Ai and Bj, then 7.^ = 0. 

P6 Forces are positive or zero. 

P7 The sum of the forces on the edges adjacent to node Ai equals a^. The sum 
of the forces on the edges adjacent to node Bj equals bj. 



6.3 Algorithm 

The algorithm consists in a succession of steps, described in the following Sec- 
tions. Step 1 is executed only once. Then a main loop, made of steps 2 to 5, is 
executed a number of times; each execution will be called a cycle. 



Step 1: Initialize 

We set up the initial state of the graph and of the associated values as follows. 
An edge is established between Q and each of the Ai, with associated force ai, 
and between P and each of the Bj, with associated force bj (see Fig. |J). We 
set ai = c sup , with c sup satisfying (0), and j3j = 0. It is easily verified that 
properties PI to P7 hold. There are two trees. 



Step 2: Descent 

Since there are two trees, property P3 ensures that there is at least one node 
in the moving tree other than Q. Since Q can be connected only to Ai nodes, 
the moving tree includes at least one Ai node. Similarly, the fixed tree includes 
at least one Bj node. Therefore the following minimum exists and can be 
computed: 

d = min 7^. (36) 

Ai£T m ,Bj£Tf J 

From property P4 we have: d > 0. 
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Next we effect the "descent of the moving tree" : 

ai := ai — d for all Ai £ T m , 

fa := I3 3 - d for all Bj £ T m . (37) 

Note that d may be zero, in which case nothing changes. 

We verify now that the properties still hold. Only the ai and flj have 
changed, therefore we have only to examine properties P4 and P5. We consider 
first P4. If ai and (3j belong to the same tree, 7^ does not change. If Ai 6 Tj 
and Bj € T m , 7^ increases. Finally, if Ai 6 T m and S T^, 7^ decreases by 
d, but remains positive or zero as a consequence of (p6[). 

We verify also P5: if there is an edge between Ai and Bj, these nodes belong 
to the same tree, and therefore 7y does not change. 

Step 3: Contact 

We consider the pair of values i = i c , j — j c which realized the minimum d in 
step 2, i.e. which were such that Ai c £ T m , Bj c £ Tf, and r yi c j c = d before 
the descent. (If more than one pair (i, j) realized the minimum, we select one 
of them arbitrarily). From (^) we find that there is now, after the descent: 

lioic = 0- 

We add one edge between nodes Ai c and Bj c , and we set the associated force 
equal to zero. 

We consider the properties. PI is still satisfied since the new edge is between 
an Ai and a Bj node. Concerning P2, since we have linked one nodes of T m 
with one node of T/, the graph now consists of a single tree. P3 does not apply 
any more. P4 is not affected by a change in the graph. P5 is satisfied for the 
new edge since r fi c j e — 0. P6 is satisfied since the new force is zero. Finally, P7 
still holds for nodes Ai c andB Jc , again because the new force is zero. 

Step 4: Readjustment 

We define the main path as the oriented path from Q to P. This path is unique 
since the graph consists of a single tree. The main path is made of three parts: 
(i) the path from Q to Ai c in the previous moving tree; (ii) the newly created 
edge from Ai c to Bj c ; (iii) the path from Bj c to P in the previous fixed tree. 
From property PI, we deduce that the first part has an odd number of edges 
(the graph is bipartite, and Q and Ai belong to different subsets). Similarly, the 
last part has an odd number of edges. Thus, the main path as a whole has an 
odd number of edges, which is at least equal to 3. We number the edges along 
the main path, from Q to P, starting from 1. We note that the two end edges, 
adjacent to Q and P, are odd-numbered, and that the newly created edge is 
even- numbered. 

We compute the minimum A of the forces associated with the odd-numbered 
edges. There is A > by virtue of property P6. 
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We readjust the forces along the main path, by adding A to the forces asso- 
ciated with even-numbered edges and subtracting A from the forces associated 
with odd- numbered edges. 

Only the forces have changed, therefore we have only to examine properties 
P6 and P7. Property P6 is obviously still true. For every node Ai or Bj along 
the main path, the changes of the forces associated with the two adjacent edges 
on the main path cancel each other, so that P7 remains true. 

Step 5: Breaking 

We consider the odd-numbered edge of the main path which realized the min- 
imum in the previous step. (If more than one edge realized the minimum, we 
select one of them arbitrarily). We will call it the breaking edge. The force 
associated with that edge is now zero. 

We delete the breaking edge. This completes one cycle of the algorithm. 

Properties PI, P5, P6 are not affected since we have simply removed an 
edge. The graph consists now again of two trees: P2 is satisfied. Property P4 
is not affected by a change in the graph. Property P7 still holds for the two 
nodes adjacent to the deleted edge since the force associated with that edge was 
zero. 

There remains to consider property P3. Since the deleted edge was on a 
path from Q to P, it is still true that one of the new trees contains Q and the 
other contains P. However it can happen that Q or P is now an isolated node. 
We distinguish two cases. 

1. The moving tree contains nodes other than Q, and the fixed tree contains 
nodes other than P. Property P3 is satisfied. We go back to step 2 for a 
new cycle. 

2. The moving tree contains Q alone, or the fixed tree contains P alone. This 
signals the end of the computation. 

It can be shown that case ^ necessarily happens after a finite number of 
steps: a definite upper bound on the number of cycles can be computed (see 
Appendix |A|) . Thus the algorithm always terminates. 

6.4 Solution 

We show now that a solution of the transportation problem has been obtained. 
This is similar to the proof given at the end of Section ^| the present derivation, 
however, makes no reference to the mechanical model. 

We will describe only the case where the moving tree contains Q alone; the 
other case, where the fixed tree contains P alone, is treated in the same way, 
exchanging rows and columns. We consider the fixed tree. It contains all nodes 
Ai and Bj, in addition to P. We define /y as follows: if there is an edge 
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between Ai and Bj, fij is equal to the associated force; otherwise fij = 0. Wc 
also define f*j as follows: if there is an edge between P and Bj, /*j is equal to 
the associated force; otherwise f*j = 0. 

By virtue of properties PI and P7, we have 

^2 fij = ai, (i = l,...,m), (38) 

3 

Ytfii+Ui = (j = l,...,n). (39) 

i 

Summing these two equations over i and j respectively and combining with (|I]) , 
we obtain 

EiW = °- (40) 

3 

From property P6 it follows that 

f*j = (j = l,...,n). (41) 

The forces are equal to zero on all edges adjacent to P. Therefore fij satisfies 
the constraints (||) to (|J): it is a feasible solution. 
From property P5, we derive 

fain = 0. (42) 

Summing over i and j, we obtain 

E "■<>, ~ E h oPj EE ^' c « = °- ( 43 ) 

i j i j 

Consider another feasible solution f--. From (||) and property P4 we have 

f'ijlij > (44) 
and therefore, summing over i and j and using (^) and (^) : 

E - E hfr -EE /'/•<./ > °- ( 45 ) 

i j i j 

Comparing with ((43|), we have 

EE^i^EE/^i ( 46 ) 

i j i j 

which shows that /y is an optimal solution. 
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7 Notes on practical implementation 



7.1 Dropping rows one by one 

Experience showed that the following modification of the algorithm results in an 
important reduction in computing time (typically a factor 3 for m — n = 100). 
The stop Q is cut into m independent pieces Qi, Q m , each supporting 
one row, and these stops are lowered one by one, each time waiting until an 
equilibrium has been reached before starting the next stop. This can be done 
by a simple modification of the algorithm described in Section ^]. 

7.2 General organization 

Measurements show that most of the computing time is spent in the descent 
phase, and specifically in finding the minimum d in (|3^). On the other hand, 
most of the complexity of the program lies in updating the structure of the trees 
and the associated information. Therefore only the computation of d needs to 
be optimized for speed. In the remainder of the program, one can freely use the 
structures which allow the easiest, most natural, and most legible representation. 

Experience shows that triply linked trees (j^, Section 2.3.3]) are a convenient 
structure. To each row are associated three pointers to its mother, its eldest 
daughter, and its next younger brother. Similarly, to each column are associated 
three pointers to its father, its eldest son, and its next younger sister. The first 
pointer is used to move upwards in the tree, for instance in order to determine 
the main path after contact has been made. The two other pointers are used 
to explore a branch, for instance in order to change its status from moving to 
fixed, or conversely, after a capture. 

Each row has seven quantities associated with it: its weight (which does 
not change during the computation); its height on; the three pointers; the force 
of contact with its mother; and a flag indicating whether the row belongs to 
the moving tree or to the fixed tree. Each column has seven similar associated 
quantities. 

7.3 Methods for descent 

Finding the minimum d defined by (|36| ) would seem to be a trivial task, involving 
two loops over i and j and about 15 lines of code. This would require a comput- 
ing time of order run for each descent. In fact, there is room for considerable 
improvement over this simple scheme. 

7.3.1 Version A 

First we observe that our task is to find the minimum among quantities 7^ 
which are positive or zero. Therefore if, during the examination of the 7^ , we 
find one which is zero, then we know that we have found the minimum and we 
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can end the search immediately. We will refer to the program incorporating this 
simple device as Version A. 

This is especially effective when the Cij take only a small number of integer 
values. We will refer to this as the degenerate case (see below, Section B.2). The 
distances 7y then take themselves only a small number of distinct values. For 
large m and n, each of these values appears many times. In particular, as soon 
as the algorithm is in progress, the value 7-y = typically appears many times. 
Therefore the search can be discontinued at an early time and the computing- 
time is much less than 0{mn). 

When this method is used, experience shows that it is advisable to start the 
search at a variable point in the 7^ matrix. If the search is always started from 
the beginning, the contact edge tends to be always selected from the same part 
of the matrix; an unbalanced situation develops, and computing time increases. 
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One method consists in choosing the starting point at random in the matrix. 
This has the disadvantage of requiring the use of a random number generator. 
A better solution (suggested by A. Noullez) is to compute the rank r' of each 
new starting point from the rank r of the previous one by a simple formula, 
such as 

r = r + \_Kmn\ (mod mn), (47) 

where [•] denotes the integer part. 

The most uniform distribution of points is obtained by choosing K = (y/5 — 
l)/2, the inverse of the golden ratio. This was found to give very good results. 

7.3.2 Version B 

Another line of attack is based on the realization that the structure of the system 
changes only partially from one cycle to the next. Therefore we can try to save 
and re-use information on the distances. (This approach was inspired by a study 
of the LSAP algorithm presented in ^ chap. 1] for the particular case of the 
assignment problem.) In particular we may try to take advantage of the tree 
structure which pervades the algorithm, noting that much of that structure is 
left intact in a capture episode. We will refer to the program developed along 
these lines as Version B. 

Several methods were tried. We describe here the method which gave the 
best results , and which is incorporated in Version B. We define a male branch 
Ai as the branch (of the moving or fixed tree) whose head is row Ai. (Note 
that here is a one-to-one correspondence between rows and male branches). For 
every pair we find the minimal distance Sij between the rows belonging 

to the male branch Ai and the column Bj, and we note for which row this 
minimal distance is realized. (Note that we consider here all male branches 
and all columns, irrespective of whether they are moving or fixed). When the 
structure of the trees changes, this information is updated. At the next descent, 
the updated information can then be used to compute quickly the minimum d: 
if the moving stop is Qi, the whole moving tree consists of the male branch Ai, 
and one has only to find the minimum among the stored distances 5ij between 
that male branch and the fixed columns. This takes a time 0(n). 

The updating of the distances Sij is somewhat complex. All male branches 
which have their head on the main path need to be reconsidered. The cases of 
capture by the fixed tree and by the moving tree require different treatments. 
Also the three pieces of the main path determined by the contact edge and the 
breaking edge have to be treated separately. Savings in computing time are 
achieved by looking for cases where it is not necessary to recompute the S^ . 

Version B is more complex than Version A. It also requires about twice as 
much memory, since the Sij array must be saved in addition to the given 
array. However, it is definitely faster in the general, non-degenerate case (see 
below, Section [Tl]). 
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8 Tests 



Numerical tests were performed to verify the correctness of the algorithm and to 

measure its performance. The computations were made on a Hewlett-Packard 

Apollo Series 700, Model 720 workstation. 

Comparisons were made with the subroutine H03ABF of the NAG library 
- in au computed cases (which number in the thousands) it was verified 

that exactly the same optimal cost @) is found with the present algorithm and 

with the NAG algorithm. 

The values dj and bj are taken as positive integers. They are first chosen at 

random in the intervals 

1 < a% < a max , 1 < bj < 6 max , (48) 

where o max and & max are two constants satisfying ma max = n6 max . Small ad- 
justments are then made in order to satisfy the relation (P exactly. 

Tests show that the computing time is insensitive to the values of a max 
and 6 max , provided that they are not too close to unity (A variation becomes 
detectable for values of 10 or less). In practice we take a max = 160000/m, 
Vax = 160000/n. 

The values c.^ are also taken as integers, randomly chosen in the interval 

1 Cij C max . (49) 

Again the computing time is found to be insensitive to the value of c max , pro- 
vided that it is large enough. Tests show that the relevant quantity is the ratio 

c max /rn\ 

c» = . (50) 



Variations of the computing time begin to be noticeable when c* is less than 
1 (see below Section |8.2| ). This corresponds to the onset of degeneracy: for 
c* <C 1, each value in the allowed range (|4^) appears many times in the cy 
matrix. Thus, the value taken for c max depends on whether the non-degenerate 
or the degenerate case is considered. 

For simplicity only the square case m — n was considered, with n ranging 
from 10 to 1000. Time measurements were generally averaged over a series of 
100 computations, in order to obtain more accurate values. 

8.1 Non-degenerate case 

We take c* = 1. Note that this corresponds to a case where each value in the 
allowed interval ( ff9| ) is present once on average in the matrix. 

Fig. [7] shows the computing time (divided by n 3 for better clarity) as a 
function of n, for three algorithms: 
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Crosses correspond to the subroutine H03ABF of the NAG library Q . The 
time appears to grow asymptotically as n 3 35 . (A curious discontinuity is 
present: the computing time jumps up suddenly by a factor of about 
2.2 between the values n = 94 and n = 95. This is probably due to 
peculiarities of the computer hardware.) 



Open circles represent Version A of the present algorithm (see Section 7.3) 
Computing time grows asymptotically as n 3 05 . 



• Filled circles represent Version B of the present algorithm (see Section 7.3). 
For low values of n, the computation is slower than with Version A because 
of the extra work involved in computing the distances Sij. Above n = 
100, however, this extra work begins to pay off. Computing time grows 
asymptotically as n 25 . 

Version B is clearly the best method. For a 1000 x 1000 problem, the NAG 
subroutine takes about 7000 seconds, while Version B takes about 110 seconds. 
The ratio increases for larger values of n. 



8.2 Degenerate case 

A value c max = 20 was chosen as representative for a degenerate problem. In 
particular, this is a typical value for applications to lattice gas problems 
Thus, Cij can take only integer values from 1 to 20. 

Fig. [S| shows computing times as a function of n, for two algorithms: the 
NAG subroutine and Version A of the present algorithm. (Version B is inefficient 
in the degenerate case and is not shown). 

• For the NAG subroutine (crosses), computing time is essentially the same 
as in the non-degenerate case (Fig. 0) up to about n = 100. For larger val- 
ues, the effect of the degeneracy begins to be felt and the slope decreases. 
Asymptotically, the computing time appears to grow as about n 2 . 

• For the present algorithm (open circles), the decrease in computing time 
with respect to the non-degenerate case is more marked and starts earlier, 
at about n = 20. The time dependence is more complex. The final slope 
indicates a dependence in n 1 - 65 . 

For a 1000 x 1000 problem, the NAG subroutine takes 400 seconds, while 
Version A takes about 1.25 seconds. The ratio again increases for larger values 
of n. 

We remark that an exponent of n less than 2 means that for large values of 
n, the time needed to solve the problem is small compared to the time needed 
to set it up, since simply copying the matrix into memory takes a time 
proportional to n 2 ! Note also that in this situation, most Cy values will never 
be used. 
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Figure 7: transportation problem: computing time (divided by n 3 ) as a function 
of size n. Crosses: NAG subroutine. Open circles: present algorithm, Version 
A. Filled circles: present algorithm, Version B. 
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Figure 8: Degenerate transportation problem: computing time (divided by n 3 ) 
as a function of size n. Crosses: NAG subroutine. Open circles: present algo- 
rithm, Version A. 



32 



8.3 Assignment problem 

Tests were also made for the particular case of the assignment problem, where 
m = n and all rods have weights dj = 1, bj = 1. (It is then easily shown that an 
optimal solution obtained with the present algorithm automatically satisfies the 
constraint (|io|). In that case, comparisons can also be made with the subroutine 
LSAP of Burkard and Derigs Q (noted BD here) . 

Fig. ^ compares computing times in the general (non-degenerate) case with 
c* = 1. As before, the NAG subroutine is much slower. The BD algorithm 
is fastest: for a 1000 x 1000 problem, computing time is about 45 seconds for 
Version B and 22 seconds for the BD algorithm. The difference decreases when 
n increases, however: the asymptotic law is about n for Version B, compared 
to n 2 - 7 for BD. 

Fig. [To] compares NAG, Version A, and BD for the degenerate assignment 
problem, with c max = 20. Here Version A is fastest: for a 1000 x 1000 problem, 
computing time is 0.6 seconds for Version A, and 14 seconds for BD. The dif- 
ference increases with n: the asymptotic behaviour is in n 1 ' 55 for Version A, n 2 
for BD and NAG. 

9 Final comments 

1. Many variations of the present algorithm could be imagined, and it is quite 
possible that some of them would increase its speed. For instance, sometimes 
an arbitrary choice can be made for the contact edge or the breaking edge (see 
steps 3 and 5 in Section ^); one might try to determine what is the best choice. 

2. As mentioned in the Introduction, the present algorithm was conceived 
and developed independently, without recourse to the existing litterature. Look- 
ing back, however, it becomes clear that some relations exist. The heights of 
the rods, for instance, are nothing else than the classical dual variables; hence 
the choice of the customary notations a.i and (3j in the present paper. 

In particular, after this work was completed, M. Hartmann called our at- 
tention to the references |l3| and 0] . These papers describe algorithms for the 
minimum cost flow problem, which includes the transportation problem as a 
special case. One starts from zero flow and continually increments it, main- 
taining at all times a minimum-cost solution, until the desired flow is attained. 
It appears that the algorithm of the present paper belongs essentially to the 
same family, known as parametric algorithms. The equivalent of the flow in the 
present model is the total force applied by the lines to the columns; as is easily 
shown, this total force starts from zero and increases with time, until it is equal 
to the total weight of the lines. Mo re s pecifically, it increases by A during each 
readjustment of the forces (Section |6~3|, Step 4). 

3. As mentioned in Section |7[ most of the computing time in the present 
algorithm is spent on finding the minimum in a large set of numbers. Efficient 
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Figure 9: Assignment problem: computing time (divided by n 3 ) as a function of 
size n. Crosses: NAG subroutine. Filled circles: present algorithm, Version B. 
Asterisks: Burkard and Derigs algorithm. 



34 



~ I 1 1 1 — I — I — I I I 

X 



T 1 1 1 1 1 — I— T 



x X 



X X x x X x 



NAG 



CO 

c 

o _ 
o . 
CD O 
CO 

o 
o 



* o 



o 

* o 
* . o 



* * * 



* * * * 
o 



* 

BD * 



to 



O - 



I 

o 



10 



_l I I I I I I 1_ 



_l I I I I I I L 



100 



1000 



n 



Figure 10: Degenerate assignment problem: computing time (divided by n 3 ) as 
a function of size n. Crosses: NAG subroutine. Open circles: present algorithm, 
Version A. Asterisks: Burkard and Derigs algorithm. 
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algorithms have been developed for this operation on massively parallel com- 
puters [|| . A parallel implementation of the algorithm might therefore be of 
interest. 

4. A comparison of figures |7] and ||, or [| and [HI shows that the solution of 
the degenerate problem is much faster. This might be of interest in situations 
where a great accuracy is not required, or is not present in the data. In such 
cases, it will be very advantageous to round off the values so as to reduce 
them to a comparatively small set of values. 

5. The idea of using mechanical analog computers for optimization problems 
is not new. For instance, reference jl4| describes a mechanical device made of 
shafts and gears, which can in principle solve the general instance of the linear 
programming problem. However, this device is introduced in |T^| only as a 
conceptual tool in a theoretical study of the complexity of analog computation; 
it is not intended as a model for a practical algorithm. We remark also that 
the mechanical model of the present paper is adapted to the special case of 
the transportation problem, and is therefore much simpler (and presumably 
more efficient) in that special case. As a rough measure, the present model has 
0(m + n) moving parts, while the model of |l4| would have 0(mn) moving parts 
in a m x n transportation problem. 

In this connection, it is natural to ask whether the mechanical model used 
here to simulate the transportation problem can be extended to the more general 
minimum cost flow problem, or to the even more general linear programming 
problem. We have not found any obvious way to do this. 

A Bounds on the maximal number of cycles 

We call Z(m, n) the maximal number of cycles, for the m x n problem, using 
the algorithm described in Section |[ We derive here some rigorous bounds on 
this number. 

A.l Upper bound 

We derive first a general upper bound for Z(m, n). 

We number with an index I the successive levels of the moving tree. The 
stop Q is at level I — 0, the sons of Q are at level 1=1, the grand-daughters 
of Q are at level I = 2, and so on. Note that odd levels correspond to rows and 
even levels to columns. We call gi the number of nodes of the tree at level I. 
The sequence of numbers gi, g2, ■ ■ ■ , will be called the signature of the moving 
tree. 

We consider now all possible signatures for given m and n, assuming that 
the final state has not yet been reached, i.e. that the moving tree still contains 
at least one row and the fixed tree still contains at least one column. We order 
these signatures as follows. First we sort by decreasing g\. Next we sort each 
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subset by increasing gi. Next we sort each subsubset (corresponding to given 
gi and 52) by decreasing 53; and so on, always using decreasing order for odd 
values of I and increasing order for even values. Finally, we number the sorted 

signatures with K = 1, 2, 

It is then easy to show that K always increases during a cycle. There are 
two cases: 

1. The moving tree captures a subtree from the fixed tree. The root of this 
subtree (after the capture) is a column. Therefore the first level / at which 
there is a change in the signature is even, and gi increases by one unit. 
From the above sorting method it follows that K increases. 

2. The moving tree loses a subtree. The root of this subtree (before the 
capture) is a row. Therefore the first level I at which there is a change in 
the signature is odd, and gi decreases by one unit. Again K increases. 

Therefore we obtain an upper bound on the number of cycles simply by 
counting the signatures. We call p = g% + g^ + . . . the number of rows, and 
q = gi + gi + . . . the number of columns in the moving tree. Since the moving 
tree is assumed to be non-empty, p can take values from 1 to m. Similarly, since 
the fixed tree is non-empty, q can take values from to n — 1. We evaluate first 
the number of signatures for given p and q. A signature can also be represented 
by a sequence of p + q binary digits: we write g\ digits 1, then g 2 digits 0, then 
33 digits 1, and so on. The first digit must be a 1. There are q digits 0, which 
can be placed anywhere in the remaining p + q — 1 positions. Therefore the 
number of possible signatures is 



Exactly the same considerations can be applied to the fixed tree. We number 
with I the successive levels. We call g[ the number of nodes at level I. The 
sequence of numbers g[, g' 2 , ■ . ■ , will be called the signature of the fixed tree. 
We sort the signatures, using decreasing order for odd I (columns) and increasing 
order for even I (rows). We number the sorted signatures with K' = 1,2,.... 
The number K' also always increases during a cycle. We call q' — g[ + g'^ + ■ ■ ■ 
the number of columns, and p' = g' 2 + g'4 + . . . the number of rows in the fixed 
tree. Counting the number of signatures as above, we find 




(51) 



Summing over p and q, we obtain the following upper bound for Z: 




(52) 




(53) 



i.e. exactly the same upper bound as in (p2). 
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A. 2 Better upper bound 

A much better upper bound can be obtained by noting that only some combi- 
nations of K and K' are permitted. This is because we must have 

p + p' = m, q + q' = n. (54) 

Thus, in a (K, K') plane, only a subset of points are allowed. Combining this 
with the fact that both K and K 1 must increase at each step, one can trace the 
possible paths in the plane and derive an upper limit on the number of steps, 
which we call Z'(m, n). 

Unfortunately a general formula giving Zg Up (m, n) for arbitrary m and n has 
not been found. Results obtained by a computer program for values of m and 
n up to 10 are listed in Table |^. 

Table 3: Upper limit Z' sup (m, n) on the number of cycles for 1 < m < 10, 
1 < n < 10. 
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10 


10 


20 


50 


94 


170 


284 


448 


682 


1022 


1534 



For m = n = 10, for instance, we have Z sup (m, n) = 184755, Z' sup (m, n) = 
1534. 

A. 3 Lower bound 

In the square case m = n, the following lower bound can be proved: 

Z{n,n) > Z in{ (n,n) = 3 x 2"" 1 - 2. (55) 

The proof is cumbersome and will not be given here. It consists in showing 
that the number of cycles equals Z ln [ in the following case: 

a\ = 1, h = 2, 
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a, = a,i-i + bi-i, b i =bi- 1 +a i , (i = 2, . . . , n - 1), 

a n = + b n -i, b n = 6 n _i, 

(n + 1 - i)(n + 1 - j) - n 2 {2 1 + V) if i = j, 
(n + l-i)(n + l-j) -n 2 (2-?') if z < j , (56) 

(n + l-i)(n + l- j) -n 2 (2 l ) if i > j. 

This was also verified by a direct application of the numerical algorithm for 
n = 1 to 17. Note that the sequence Oi, b\, 0,2, 62, ■ ■ ■ is the Fibonacci sequence, 
minus its first term. 

Comparing with the diagonal of Table ^, we find that the upper bound given 
in that Table is identical to the lower bound given by (p5|). Therefore, for n = 1 
to 10, we know the exact value of the maximal number of cycles, which is 

Z(n,ri) = 3x2 n - 1 -2, (57) 

and (]5q ) is a worst case, achieving this maximal value. There is a strong sug- 
gestion that (|57|) holds for all values of n, but this has not been proved. 



A. 4 Comparison with observed values 

We observe that (|5^) corresponds to a computing time which grows exponen- 
tially with n. Fortunately, numerical tests with randomly chosen examples show 
a much milder increase, which is approximately linear in n. Table ^ compares 
the values (|57| ) with the average observed values of the number of cycles, for 
Version B of the algorithm, in the non-degenerate case. The r.m.s. dispersions 
are also given; they show that individual values do not deviate much from the 
average. 



Table 4: Number of cycles in the "bad" case 
number of cycles (column 3). 



1) (column 2), and observed 



n 


Z in f(n,n) 


observed 


5 


46 


10 ± 1 


10 


1534 


23 ±2 


15 


49150 


37 ±3 


20 


1572862 


50 ±4 


25 


50331646 


64 ±5 


30 


1610612734 


80 ±6 


50 




141 ± 10 


100 




304 ± 17 



This considerable difference between the "bad case" (|56])and the average case 



can be probably understood by noting that (56) is a rather extreme case: the 
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coefficients di, bj, Cij form essentially geometrical progressions. For m = 1000, 
for instance, the ratio aiooo/ai is of the order of 10 400 . This is not likely to be 
encountered in applications. 
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B Addendum (September 2002) 



The present paper was submitted to Mathematical Programming in May 1992. 
The following referee's report was subsequently received: 

(Beginning of referee's report) 

General comments 

This well-written paper proposes an algorithm for the transportation problem 
that is motivated by an analog model, and has some comparisons of a code for 
this algorithm with a NAG code. The new algorithm is found to be much faster 
than the NAG code on randomly generated problems. 

The author of this paper comes from outside the Math Programming com- 
munity and is to be commended for taking the time to bring his/her fresh 
perspective to the subject. However, it is tempting to dismiss this paper on 
the basis that the author has merely re-discovered a known algorithm, namely 
the one in [13], noting that [13] already contains a computational comparison of 
such an algorithm with other algorithms existing at the time. A further reason- 
able criticism of the current paper is that the NAG routine "H03ABF uses the 
'stepping stone' method, modified to accept degenerate cases" (quoting from 
the NAG documentation), referenced in the book "Linear Programming" by 
G. Hadley, 1962 (this description should have been in the paper). The routine 
itself has been in the NAG library at least from 1975. Thus this code does not 
represent the current state of the art in transportation algorithms, as would 
be found, e.g., in Ahuja et al. Also, comparing codes on randomly generated 
problems can be misleading in any case. 

On the other hand, I believe that in general such outside contributions should 
be valued, and specifically that this paper has something to offer if it is drasti- 
cally re- written: 

1. The application of transportation problems to lattice gas models is in- 
triguing, and is worth further discussion. Is there a short way to say how 
such models arise and the significance of the transportation subproblem? 
Also, do such problems tend to be sparse or dense, how big are m and n 
in typical problems, how large do the supplies and demands tend to be, 
and how large do the costs tend to be? These are all parameters that are 
important in assessing which of the standard modern algorithms might 
work well on such problems. 

2. The computational testing in the paper needs to be fixed. It would be more 
believable if the codes were tested on instances arising from the actual 
application rather than random problems. It would also be a service to the 
community to release several typical examples of such problems to, e.g., 
the DIMACS library of network flow instances, so that other codes can 
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be tried on these problems. DIMACS also has available some well-tested 
random network generators; it enhances comparability between codes if 
standard generators are used instead of ad hoc generators. DIMACS codes 
and generators are available via anonymous ftp from dimacs.rutgers.edu 
in directory pub/netflow. 

3. I enjoyed the analog model and the physical insight it gives to the al- 
gorithm. In fact, I believe that a slightly different presentation of the 
algorithm that tics it more closely to the physical model would improve 
the paper: My suggestion is to make P and Q into an extra row and 
column (which they effectively are in the model anyway), both with zero 
weight/buoyancy, and such that cpj — CiQ — for all i and j. However, to 
act as stops in the initial configuration, we must give P an artificial weight 
of ^bj, and Q an initial buoyancy of X) a «- The initial configuration is 
in fact optimal with these artificial weights. The aim of the algorithm is 
then to decrease the artificial weights to zero while maintaining optimality 
at each intermediate artificial weight value. Thinking of P and Q as lines 
remove the need to treat them as special cases elsewhere. 

4. Much of the proof of the algorithm's correctness is unnecessary since the 
author is just rediscovering well-known arguments, and has proposed an 
algorithm that fits nicely into known classes of algorithms. For example, 
equation (25) is known as complementary slackness, and the fact that 
complementary slackness plus feasibility equals optimality is so well-known 
that it can be stated without a reference. Although the author derived 
the algorithm independently of known algorithms, in order to effectively 
present it to an audience which is familiar with known algorithms it would 
be helpful to discuss the algorithm as if it were a special case of what is 
called "dual node-infeasible" simplex algorithms in [13]. I have in mind 
something like the following (assuming that suggestion 3 above is taken): 
The algorithm maintains dual feasible variables (the heights) and primal 
flows (forces; this correspondence between heights/forces and dual/primal 
variables is too important to leave it to a comment in the conclusion) that 
satisfy non-negativity and all supplies and demands except possibly at 
the extra nodes P and Q, and ensures that the primal and dual variables 
satisfy complementary slackness. It also maintains a basic tree, namely 
the fixed tree plus the moving tree, plus the extra arc where fresh contact 
between the fixed and moving trees occurs. The extra arc allows some 
of the surplus supply at P to be pushed through the tree to cancel out 
some of the surplus demand at Q. An arc whose flow drops to zero during 
this flow push can then be the dual simplex outarc, and a standard dual 
simplex pivot (whose two sides will be the fixed tree and the moving tree) 
will determine which is the new extra inarc to be added to the tree. This 
continues until there is no surplus supply or demand at P or Q. This 
change would allow Section 6 to be reduced to a few sentences in Section 
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4, since it is well-understood that node-infeasible dual network simplex 
maintains dual feasibility, primal feasibility except for conservation, and 
complementary slackness, and that when conservation is achieved we must 
be optimal. Without this change and its attendant severe shortening, 
the paper is not publishable since large parts of it recapitulate familiar 
arguments. 

5. The Appendix should be shortened and moved into the text to establish 
what is known about how fast the algorithm converges. All that is needed 
is the observation that the algorithm is always finite without the need for 
any anti-degeneracy device (since signatures are lexicographically strictly 



n x n examples exist which use 0(2") pivots. It should be pointed out 
that each piece of data in these n x n examples has only 0(n) bits, so the 
examples show that the algorithm is not even weakly polynomial. Also, 
either "iterations" or "pivots" is preferable to the word "cycles", since 
"cycles" is too suggestive of cycles in graphs. 

6. It seems possible that by using scaling (see, e.g., Ahuja et al.), a weakly 
polynomial version of the algorithm could be developed. The rough idea 
would be to scale the supplies and demands. Although in general this 
would lead to a problem where ^ a-i ¥"^2^, the extra nodes would hand- 
ily absorb the extra flow. When we optimize at one scale factor and want 
to move to the next one, supply or demand at a node might increase by 
one, and we need to regain optimality. I believe that we can do this by 
a shortest path computation at each such node, looking for the shortest 
distance path that will allow us to rehang the extra flow from the proper 
one of P or Q. This can be done in polynomial time. We then have a 
problem where the total surplus supply/demand at P and Q is 0(m + n). 
Degeneracy could cause a problem here since it makes it difficult to get 
a polynomial bound on iterations before optimality. I believe that the 
same shortest path trick can be used to ensure that each iteration moves 
at least one unit of flow. This shortest path business compresses several 
iterations into a single iteration, and might be useful in general. (Ideally, 
it would be nice to see a computational comparison of the shortest path 
version with the tree version of the algorithm on degenerate problems.) In 
addition, the shortest path neatly ties in with the notion of the reduced 
costs as "distances" , and is defensible in terms of the model as looking for 
the smallest distance to move the moving tree so that some of the force 
on P and Q can be lessened. This form of the algorithm starts to look 
very much like the well-known successive shortest paths algorithm (see 
Ahuja et al.). In any case, a mention of a possible scaling version of the 
algorithm would be useful. 



increasing), that the upper bound 
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7. Most readers will prefer the term "transportation problem" to "Hitchcock 
problem" . 
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(End of referee's report) 

Incidentally, I wish to thank here the unknown referee for taking the trouble 
to write such a detailed and helpful report. 

It took me some time to obtain the necessary papers, to assimilate them 
and to re-do the calculations. In the end, I was able to convince myself that, 
unfortunately, the referee was entirely correct: 

(i) The algorithm is not new. It is in fact identical, apart from some trivial 
changes, with the "parametric network simplex algorithm" described for in- 
stance in the book "Network flows" , published in 1993 by Ahuja, Magnanti and 
Orlin, pages 433-437. 

(ii) The NAG algorithm which I used as a basis for comparison is not state 
of the art. I made comparisons with more modern algorithms, specifically with 
the results of Jianxiu Hao and George Kocur in the paper "An implementation 
of a shortest augmenting path algorithm for the assignment problem" (found on 
the Internet server dimacs.rutgers.edu, dated 1992), and my program does 
not show a marked advantage anymore. 

Unfortunately I did not feel able to follow the suggestion of the referee and to 
drastically rewrite the paper; it would have been too much work, in a field with 
which I am not very familiar. So I published a much shorter version, including 
only the mechanical model (Section 3 of the present paper) in Comptes Rendus 
de VAcademie des Sciences, Paris, 321, Serie I, 741-745 (1995). 

Recently, some of my colleagues have used ideas from the present paper 
to solve a problem in cosmology (see "A reconstruction of the initial condi- 
tions of the Universe by optimal mass transportation" , by Uriel Frisch, Sabino 
Matarrese, Roya M ohayaee, Andrei Sobolevski, Nature, 417, 260-262 (2002) = 
|a,stro-ph/0109483| ), and they asked me to make the paper generally available by 
submitting it to arXiv. 

The present text is unchanged from the 1992 original, with two exceptions: 
a minor error has been corrected in Equ. (47), and the expression Hitchcock 
problem has been replaced by the more modern name transportation problem, 
as suggested by the referee. 
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